time-lapse video
Bench to Time-lapse Video Generation
The emergence of large-scale text-to-image models [92, 60, 59, 58, 42, 5, 94, 14, 54, 40] has significantly advanced the field of Text-to-Video (T2V) generation [66, 6, 7, 21, 73, 90]. Existing T2V architectures can be categorized into two types: U-Net-based and DiT-based. The latter focuses on recreating open-source structures similar to Sora [9], using the DiT (Diffusion Transformer) [57] framework for T2V generation [43, 95, 93, 20]. When calculating the MTScore, the video retrieval model uses these texts to evaluate each frame of the video, assigning probabilities based on the matches. The final result is obtained by summing the general probability and the metamorphic probability.
- North America > United States (0.04)
- Asia > Singapore (0.04)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- North America > United States (0.04)
- Asia > Singapore (0.04)
- Asia > China (0.04)
- Information Technology (0.68)
- Consumer Products & Services (0.46)
- Energy (0.46)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > Singapore (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Yuan, Shenghai; Huang, Jinfa; Xu, Yongqi; Liu, Yaoyang; Zhang, Shaofeng; Shi, Yujun; Zhu, Ruijie; Cheng, Xinhua; Luo, Jiebo; Yuan, Li
We propose a novel text-to-video (T2V) generation benchmark, ChronoMagic-Bench, to evaluate the temporal and metamorphic capabilities of the T2V models (e.g. Sora and Lumiere) in time-lapse video generation. In contrast to existing benchmarks that focus on the visual quality and textual relevance of generated videos, ChronoMagic-Bench focuses on the model's ability to generate time-lapse videos with significant metamorphic amplitude and temporal coherence. The benchmark probes T2V models for their physics, biology, and chemistry capabilities, in a free-form text query. For these purposes, ChronoMagic-Bench introduces 1,649 prompts and real-world videos as references, categorized into four major types of time-lapse videos: biological, human-created, meteorological, and physical phenomena, which are further divided into 75 subcategories. This categorization comprehensively evaluates the model's capacity to handle diverse and complex transformations. To accurately align human preference with the benchmark, we introduce two new automatic metrics, MTScore and CHScore, to evaluate the videos' metamorphic attributes and temporal coherence. MTScore measures the metamorphic amplitude, reflecting the degree of change over time, while CHScore assesses the temporal coherence, ensuring the generated videos maintain logical progression and continuity. Based on the ChronoMagic-Bench, we conduct comprehensive manual evaluations of ten representative T2V models, revealing their strengths and weaknesses across different categories of prompts, and providing a thorough evaluation framework that addresses current gaps in video generation research. Moreover, we create a large-scale ChronoMagic-Pro dataset, containing 460k high-quality pairs of 720p time-lapse videos and detailed captions ensuring high physical pertinence and large metamorphic amplitude.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Consumer Products & Services (0.46)
- Transportation (0.46)
MIT researchers train AI to predict how humans paint works of art
MIT researchers have created an AI tool capable of generating time-lapse videos that predict how human artists use their hands to create watercolor or digital paintings. The AI is trained using time-lapse videos of people making art on Vimeo and YouTube. The probabilistic model can synthesize and predict moments in the painting process from just a single image of an artwork. The network is meant to mimic the ability skilled human artists possess to see a piece of art and comprehend the series of brush strokes or steps a person took to put it together. There are often many possible ways to create a given painting.
- Law > Statutes (0.55)
- Government (0.34)
- Information Technology > Communications > Social Media (0.61)
- Information Technology > Artificial Intelligence > Applied AI (0.40)
Sydney Startup Uses AI to Improve IVF Success Rate | NVIDIA Blog
In vitro fertilization, a common treatment for infertility, is a lengthy undertaking for prospective parents, involving ultrasounds, blood tests and injections of fertility medications. If the process doesn't end up in a successful pregnancy -- which is often the case -- it can be a major emotional and financial blow. Sydney-based healthcare startup Harrison.ai is using deep learning to improve the odds of success for thousands of IVF patients. Its AI model, IVY, is used by Virtus Health, a global provider of assisted reproductive services, to help doctors evaluate which embryo candidate has the best chance of implantation into the patient. Founded by brothers Aengus and Dimitry Tran in 2017, Harrison.ai
- Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.56)
- Information Technology > Hardware (0.45)